SINGLE POINT INTERACTION SCREEN TO PREDICT IC50
Cross-Reference to Related Application 
This application claims priority of U.S. provisional application number 
60/193,717, filed March 31, 2000. 
Field of the Invention

This invention relates to an improved method for determining the effect of new 
chemical entities on biologically active proteins. In particular, this invention relates to 
an improved method for determining the potential for drug-drug interaction involving 
cytochrome P450s (CYP) with new chemical entities. 


Background of the Invention 
Unfavorable drug-drug interactions (DDI) are only responsible for
approximately 1-2% of clinically relevant DDI (Fuhr et al., 1996), but are still an
important factor in determining whether a new chemical entity will successfully make
it beyond a drug discovery program to development. In addition, the late discovery of
a clinically significant drug-drug interaction is costly in terms of the financial
investment in a particular project. It is therefore important to screen for potential
interactions early on (FDA, 1997; Wrighton and Silber, 1996) as well as to select the
most appropriate in vivo studies (FDA, 1998; Tucker, 1992). This rapid determination
of clinical viability, or fast, efficient killing of compounds, has been commented on
previously (Miwa, 1995). In this regard drug interactions with cytochrome P450s
(CYPs) are particularly important (Lin and Lu, 1997), but ultimately avoidable
(depending on the therapeutic index), even though drug metabolism is complex and
only to some extent predictable by careful assessment of structure-activity
relationships. CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A4 represent greater than 90% of
total hepatic P450 (Shimada et al., 1994) and nearly 80% of therapeutic drugs are
metabolized by these same enzymes (Smith et al., 1998). Interaction with one or
more of these enzymes in vivo would thus pose a potentially clinically relevant event.
In recent years, in vitro systems have proved invaluable in predicting the likelihood of
DDI (Lin and Lu, 1997) as they allow identification of the CYPs responsible for
metabolism as well as determination of the relative contribution to overall elimination
of the inhibited pathways (Lin, 1998). Since the number of molecules synthesized by
pharmaceutical companies has dramatically increased with the utilization of
combinatorial chemistry, there is now a shift in emphasis towards earlier
implementation of higher throughput in vitro studies for metabolism (Moody et al.,
1999; Rodrigues, 1997) or lead optimization (Tarbit and Berman, 1998). The
prediction of drug-drug interactions of new chemical entities (NCEs) using in vitro
methods, such as human liver microsomes, hepatocytes (Pichard et al., 1990) or
individual expressed CYPs, has escalated both in importance and scale of use, as
one way to reliably avoid potential interactions in vivo (Tucker, 1992). However,
occasionally there may be discrepancies when comparing predictions in each system
(Lin, 1998), as ultimately metabolism in vivo is complicated by the role of transport
processes (Ito et al., 1998).

As for the technologies available for determining DDI in vitro, groups are
now utilizing multiwell plates, multiwell pipetting, column switching, automation
and mass spectrometry (MS) (Korfmacher et al., 1997), which are revolutionizing
sample throughput for drug metabolism and are easily applicable. A number of
groups have looked to radiolabeled substrates as a means of increasing the speed of
screening for CYP interactions without the need for high performance liquid
chromatography (HPLC) or MS detection (Hopkins et al., 1998; Moody et al., 1999;
Rodrigues et al., 1997). However, this technique has the considerable disadvantage
of creating low level radioactive waste for disposal, which in some locations may be
both costly and undesirable. In contrast, following on from the use of fluorescent
probes with whole cells cultured in 96 well plates (Donato et al., 1993), others have
looked at fluorescent probes and plate reader technologies with expressed CYPs
(Crespi et al., 1997; Crespi et al., 1998). Ultimately, these methods are severely
limited by potential interference from the non-optimal spectral characteristics of the
fluorescent substrate, the requirement for increased levels of expressed enzymes and
the frequent observation of activation with some substrates and inhibitors
(Mankowski, unpublished observation). There are also some important limitations
of in vitro incubations which should be considered (Bertz and Granneman, 1997;
Ekins et al., 1998a; Ekins et al., 1998b; Maenpaa et al., 1998). Perhaps the most
significant is the effect of organic solvents, which has been widely evaluated (Busby Jr
et al., 1999; Chauret et al., 1998; Hickman et al., 1998); hence the solvent volume
added to the incubation is ultimately restricted. We are then left with using more
efficiently the multiwell plate, liquid handling and analytical detection technologies
already available to us. As the speed and sensitivity of analytical detection using MS
have increased, allowing very short run times (less than a minute), detection is no
longer a bottleneck.

Another way to optimize the throughput of current CYP interaction screens has
already been addressed by some groups, namely the possibility of using fewer
inhibitor concentrations in order to fit more determinations on a 96 well plate (Moody
et al., 1999; Wynalda and Wienkers, 1997). Although in some cases correlations
have been presented to validate this direction (Moody et al., 1999), there has been
little or no discussion of the relevance or utility of single point screening for CYP
interaction studies. Likewise, there is some indecision as to the number of
determinations for each sample, e.g. use of duplicates or single points, as well as the
number of controls used. In future, the importance of screening for DDI with all CYPs
and other enzymes involved in drug metabolism will be high, and automation of all
liquid handling, incubation and analytical techniques will leave the emphasis on the
number of samples obtainable per plate. Beyond the logistics of the experiment itself
is the use of techniques such as statistical experimental design, whereby
computational approaches are used to determine the nature of inhibition as well as
to estimate kinetic constants like Ki and IC50 (Bronson et al., 1995; Lutz et al., 1996).
By determining signal windows for high throughput screens, another group has
shown the benefits of using compounds in single wells as opposed to duplicates,
which would naturally double the number of compounds on a multiwell plate and
positively affect the cost and time of screening (Sittampalam et al., 1997). With these
considerations in mind, the present study evaluates 10 or 3 point CYP IC50 screening
and the potential for using a single point interaction screen to predict IC50 for DDI
studies. This would be a means of optimizing the present 96 well plate technology to
the fullest extent and achieving the goal of these updated CYP interaction screens.
Ultimately, high throughput DDI screens will provide data to generate and validate
predictive CYP computational models (Ekins et al., 1999a; Ekins et al., 1999b).

Decreasing the time and resources needed to screen experimental drug compounds
makes it possible to screen more compounds for a given investment, and increases
the chances of discovering commercially viable compounds. One method for
decreasing the time and resources for biological screening is to reduce the number of
concentrations used in standard assays. Biological activity, as measured by IC50, the
concentration of compound that results in 50% of the maximal inhibition, or EC50, the
concentration that results in 50% of the maximal effect, can be determined by a
mathematical relationship learned after an initial period of screening at multiple
concentrations. Screening at a single concentration increases the number of
compounds that can be screened, and also reduces the quantity of compound that
must be prepared for screening, saving on preparatory resources. For many assays
that we have studied, screening at a single concentration results in equivalent
precision of the IC50 value to that obtained by screening at multiple concentrations.
Summary of the Invention

This invention provides a method for routine determination of the IC50 values for
compounds via biological assay at a single concentration.

The term "IC50" means the concentration of a test compound which produces
50% of the maximal inhibition on a biologically active protein target. "IC50" as used
herein includes "EC50", the concentration of a test compound which produces 50% of
the maximal effect on a biologically active protein target. As used herein, the term
"target" means a biologically active protein. Targets include cytochromes P450,
enzymes, receptors, and transporters.



This invention provides a method for routine determination of IC50 values for
compounds via biological assay at a single concentration, which comprises:

a) Developing or identifying a biological assay capable of producing a
percent inhibition or percent effect for a compound tested at a known concentration;

b) Performing the assay on an initial collection of at least 10 compounds, and
at least 1 commercially available compound to be used as positive control,
each assayed at a set of 3 to 10 or more concentrations, measuring a percent
inhibition or percent effect at each concentration for each compound;

c) Determining an IC50 for each of these initial compounds and the positive
control compound(s), by fitting a mathematical dose response curve, such as the Hill
function,

$$\text{percent inhibition} = \frac{100}{1 + \dfrac{IC_{50}}{\text{concentration}}}$$

to the data for each compound, using a computer, and standard linear or nonlinear
regression techniques;

d) Using the resultant data from these initial compounds to fit a
mathematical relationship between the IC50 values and the percent inhibition values at
a single fixed concentration X,

IC50 = f(percent inhibition) in general, or, for example,
IC50 = exp{a + b · (percent inhibition at concentration X)},

e) Using a computer, and standard linear or nonlinear regression techniques,
resulting in an equation relating IC50 to percent inhibition or percent response;
then assaying all remaining and future test compounds at the previously fixed single
concentration X, and determining the IC50 via the mathematical equation developed
in step d).
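
As an illustration only, steps d) and e) can be sketched in a few lines of Python. This is
not the inventors' implementation: the function names and the calibration numbers are
hypothetical, and ordinary least squares on the natural logarithm of IC50 is assumed as
one instance of the "standard linear or nonlinear regression techniques" named above.

```python
import math


def fit_log_linear(percent_inhibition, ic50):
    """Least-squares fit of ln(IC50) = a + b * (percent inhibition at X)."""
    n = len(ic50)
    log_ic50 = [math.log(v) for v in ic50]
    mean_x = sum(percent_inhibition) / n
    mean_y = sum(log_ic50) / n
    sxx = sum((x - mean_x) ** 2 for x in percent_inhibition)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(percent_inhibition, log_ic50))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_ic50(a, b, percent_inhibition_at_x):
    """Step e): IC50 = exp{a + b * (percent inhibition at concentration X)}."""
    return math.exp(a + b * percent_inhibition_at_x)


# Hypothetical calibration set: percent inhibition at X and multi-point IC50 values.
calib_inhibition = [10.0, 25.0, 40.0, 55.0, 70.0, 85.0]
calib_ic50 = [27.0, 9.0, 4.5, 2.5, 1.3, 0.5]

a_hat, b_hat = fit_log_linear(calib_inhibition, calib_ic50)
# Single-point estimate for a new compound showing 50% inhibition at X.
new_estimate = predict_ic50(a_hat, b_hat, 50.0)
```

Once a and b have been estimated from the initial compounds, each later compound needs
only one well at concentration X to obtain an IC50 estimate.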

This invention further provides a method for determining IC50 values for
drug-drug interactions related to cytochrome P450 (CYP).

This invention further provides a method wherein said CYP is CYP2C9.

This invention further provides a method wherein said CYP is CYP2D6.

This invention further provides a method wherein said CYP is CYP3A4.

This invention further provides a method wherein X is 3.

This invention further provides a method wherein X is greater or less than 3.

This invention further provides a method wherein said CYP is CYP1A2.

This invention further provides a method wherein said CYP is CYP2C19.

This invention further provides a method wherein said CYP is recombinant
CYP2D6.

This invention further provides a method wherein said CYP is replaced by any
enzyme.

This invention further provides a method wherein said CYP is replaced by any
human, mammalian, plant, fungal, bacterial or insect derived enzyme.

This invention further provides a method wherein said CYP is replaced by any
human, mammalian, plant, fungal, bacterial or insect derived transporter.

This invention further provides a method wherein said CYP is replaced by any
human, mammalian, plant, fungal, bacterial or insect derived receptor.

This invention further provides a method wherein said CYP is produced by
molecular biology techniques and expressed in human, animal, insect, fungal,
bacterial, yeast or viral cells.

This invention further provides a method wherein said CYP is replaced by any
human, mammalian, plant, insect, fungal, yeast, bacterial or viral derived enzyme,
transporter or receptor produced by molecular biology techniques and expressed in
human, animal, insect, fungal, yeast, bacterial or viral cells.






Brief Description of the Drawings 
Figure 1. Comparison of 10 point and 3 point IC50 [µM] for sample proprietary
compound data. For experimental details see Examples.

Figure 2. Relationship between log10(IC50) [µM] and percent inhibition at 3 µM for
proprietary compound data. For experimental details see Examples.

Figure 3. Comparison of positive control compounds and proprietary compound data.
For experimental details see Examples.

Figure 4. Regression models of log10(IC50) [µM] vs. percent inhibition at 1, 3 or 10 µM.
For experimental details see Examples.

Figure 5. Reference distribution of the randomized T based on 1000 numbers
generated from the Monte Carlo simulations under the null hypothesis, and the
observed T value.

Figure 6. One point predicted IC50 [µM] using percent inhibition at 3 µM vs. 10 or 3
point IC50 [µM] on the test set. For experimental details see Examples.

Figure 7. Regression model with percent inhibition at 3 µM and 95% prediction
interval and all but four sample proprietary compound data for the three screenings.
For experimental details see Examples.

Figure 8. Regression model for positive control compounds. For experimental details
see Examples.

Figure 9. Regression model with percent inhibition at 3 µM for CYP1A2 data. For
experimental details see Examples.

Figure 10. Regression model with percent inhibition at 3 µM for CYP2C9 and
CYP2C19 data. For experimental details see Examples.

Figure 11. Regression model with percent inhibition at 3 µM for recombinant CYP2D6
data. For experimental details see Examples.

Figure 12. Final regression model with percent inhibition at 3 µM for all CYP data. For
experimental details see Examples.
Detailed Description of the Invention
Drug-drug interactions involving cytochrome P450 (CYP) are an important
factor in whether a new chemical entity will survive through to the development stage.
Identifying this potential in vitro as early as possible therefore saves
considerable unnecessary future investment. In vitro CYP interaction screening data
for CYP2C9, CYP2D6 and CYP3A4 were analyzed to determine the correlation of 10
and 3 point determinations (r² = 0.98, Figure 1). Following this we investigated
whether a single point could also be predictive of IC50. We found that the IC50 value
could be predicted by a single value of percent inhibition at either 10, 3 or 1 µM. This
enables determination of more IC50 values on a multi-well plate and results in more
economical use of compounds. Statistical analysis of proprietary compound data for
CYP2C9, CYP2D6 and CYP3A4 showed that there is a strong linear relationship
between log10(IC50) and percent inhibition at 3 µM (r² = 0.90) and that it is possible to
predict a compound's IC50 value from the percent inhibition value obtained at 3 µM. The
95% prediction boundary for this is roughly ± 0.3 on the log10 scale, which is comparable
to the variability of in vitro determinations for positive control IC50 data (Table 1 in
Example 8). More data (for CYP2C19, CYP1A2 and recombinant CYP2D6) were
obtained, which enabled the model to be updated. The final model is described in
detail below. The use of a single inhibitor concentration would offer the opportunity to
drastically speed up screening for CYP interactions, which is important given the
challenges posed by combinatorial chemistry generating orders of magnitude more
new chemical entities. In addition, this algorithmic approach would be
applicable to other in vitro bioactivity and therapeutic target enzyme screens that
have historically utilized multiple compound concentrations to determine IC50 or EC50
values.

Initially, IC50 values for 204 data points from the CYP2C9, CYP2D6
and CYP3A4 screens were available. Amongst the 204 data points, 163 were from
proprietary compounds, and 41 were from commercially available compounds that
were used as positive controls. The IC50 values were generated based on percent
inhibition at either 10 or 3 different concentrations. The 10 point IC50 values were
compared with 3 point IC50 values and a high correlation was observed (r² = 0.98,
Figure 1). This naturally led us to investigate whether we could reliably predict IC50
using fewer than 3 points, i.e. a single point screen. The 10 point IC50 or 3 point
IC50 values were typically generated through the fit of a dose-response curve of a
particular functional form. In the CYP screening, the dose-response curve used is
the well-known Hill function, which can be expressed as:

$$100 - \text{percent inhibition at } x = \frac{100}{1 + \left(IC_{50}/x\right)^{h}},$$

where x is the concentration and h is the Hill parameter, which is set to -1 here.
In cases like this, there is usually a close correlation between the IC50 value and
percent inhibition at a fixed concentration. For example, if the dose-response
function used is the above mentioned Hill function, then it is possible to show that

$$IC_{50} = x \cdot \left(\frac{\text{percent inhibition at } x}{100 - \text{percent inhibition at } x}\right)^{1/h}.$$

Therefore it is also possible to find a high correlation between log10(IC50) and
percent inhibition at 3 µM (Figure 2). The variation seen in the plot (the data do not
all fall on a thin curve) is caused by factors such as measurement error and
the use of different human liver microsome lots. This type of data can
be analyzed by a statistical method, and a mathematical model can then be built
that describes the relationship between the variables (in this case, IC50 and percent
inhibition at concentration x) as well as the variations in the data. We used
regression analysis to analyze the data and build the mathematical model. Our
analysis was carried out using the statistical software Splus (Becker et al., 1988).
Regression analysis was performed as described previously (Draper and Smith,
1981). Details of the analysis can be found in Example 3.
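
The Hill relationship and its inverse can be checked numerically. The following Python
sketch is illustrative only: the function names are hypothetical, and it simply encodes
the two expressions given above with the Hill parameter h = -1.

```python
def remaining_activity(x, ic50, h=-1.0):
    """100 - percent inhibition at concentration x, from the Hill function."""
    return 100.0 / (1.0 + (ic50 / x) ** h)


def ic50_from_single_point(x, percent_inhibition, h=-1.0):
    """Invert the Hill function: IC50 = x * (PI / (100 - PI)) ** (1 / h)."""
    if not 0.0 < percent_inhibition < 100.0:
        raise ValueError("percent inhibition must lie strictly between 0 and 100")
    return x * (percent_inhibition / (100.0 - percent_inhibition)) ** (1.0 / h)


# Round trip for a hypothetical compound with IC50 = 1 uM tested at 3 uM.
inhibition_at_3 = 100.0 - remaining_activity(3.0, 1.0)
recovered_ic50 = ic50_from_single_point(3.0, inhibition_at_3)
```

Under an exact Hill curve a single measurement would determine IC50 exactly; the
regression model is needed because real measurements scatter around that curve.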

During the analysis, we needed to decide whether to use percent inhibition
at 1 µM, percent inhibition at 3 µM or percent inhibition at 10 µM in the model. To
do this we used a randomization t-test proposed by H. van der Voet (van der Voet,
1994) to compare the predictive nature of the three models. The result of the
randomization t-test helped us to decide to use percent inhibition at 3 µM in the
model. Further details of this procedure are described in Example 3. In general,
statistical model selection procedures, such as the one just mentioned, can
be used to help select the appropriate model.

An examination of all the initially available data (Figure 3) shows that the
positive control compounds all fall in the low IC50 and low 100-percent inhibition at 3
µM region, and they mostly do not overlap with the proprietary compound data. In
addition, the relationship between log10(IC50) and 100-percent inhibition at 3 µM
seems to follow a different slope for positive control compounds. This is particularly
evident with IC50 values less than 0.5 µM. Therefore separate models were produced
for proprietary compounds and the positive control compounds. For the positive
control compounds, the best model was produced by using percent inhibition at 1 µM
as the independent variable. The slope is different from that for proprietary
compounds. Details of this are also given in Example 3.

Since the initial analysis, more data for CYP screening, such as CYP2C19,
CYP1A2 and recombinant CYP2D6 (rCYP2D6), were obtained. The new data
were added to the initial data and further analysis was performed (see Examples 4, 5,
6). After combining all the available data, we found that there is very little difference
among the models for individual CYP screens. We therefore decided to build one
model for the CYP1A2, CYP2C9, CYP2C19, CYP2D6, rCYP2D6 and CYP3A4 screens
(Example 7). We used a regression model with the percent inhibition at 3 µM as the
independent variable. As mentioned before, the data suggest that there
should be at least two different slopes, one for very potent compounds and one for
less potent compounds. We used a statistical method to determine how many
different slopes there should be and where the change point should be. For the CYP
screen data, we found that two different slopes with a change point at (100-percent
inhibition at 3 µM) = 17 would yield the smallest residual mean squared error.
Therefore the model has two different slopes with a change point at (100-percent
inhibition at 3 µM) = 17. Details are in Example 7 and Figure 12.
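
The change-point selection described above can be illustrated with a small Python
sketch. It is not the statistical method actually used, which is not specified here:
it assumes, for illustration, two independent least-squares lines on either side of
each candidate change point and picks the candidate with the smallest residual mean
squared error. All names and the synthetic data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares; returns (intercept, slope, residual sum of squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def best_change_point(xs, ys, candidates, min_points=3):
    """Return (mse, change_point, left_fit, right_fit) with the smallest mse."""
    best = None
    for c in candidates:
        left = [(x, y) for x, y in zip(xs, ys) if x <= c]
        right = [(x, y) for x, y in zip(xs, ys) if x > c]
        if len(left) < min_points or len(right) < min_points:
            continue
        l_int, l_slope, l_sse = fit_line(*zip(*left))
        r_int, r_slope, r_sse = fit_line(*zip(*right))
        # Four fitted parameters: two intercepts and two slopes.
        mse = (l_sse + r_sse) / (len(xs) - 4)
        if best is None or mse < best[0]:
            best = (mse, c, (l_int, l_slope), (r_int, r_slope))
    return best


# Synthetic, noise-free data with a known change point at x = 17, where x stands
# for (100 - percent inhibition at 3 uM) and y for log10(IC50).
demo_x = list(range(1, 41))
demo_y = [0.05 * x - 1.5 if x <= 17 else 0.02 * x - 0.5 for x in demo_x]
demo_result = best_change_point(demo_x, demo_y, range(5, 36))
```

With real screening data the minimum is less sharp, so plotting the residual mean
squared error against the candidate change points is a useful check.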



Examples 

Materials. Quinidine, isocitric dehydrogenase, DL-isocitric acid, NADP,
ticlopidine, acetophenetidin, and diclofenac were purchased from Sigma Chemical
Co. (St Louis, MO). Ketoconazole, (±)-bufuralol, sulfaphenazole, furafylline and
(S)-(+)-mephenytoin were obtained from Gentest Inc (Woburn, MA).
4-Androsten-17β-ol-3-one was obtained from Steraloids. Magnesium chloride, sodium
phosphate monobasic and sodium phosphate dibasic were obtained from Fisher Scientific.
All other compounds designated were synthesized at Pfizer (Groton, CT). Solvents were
obtained from J T Baker (Phillipsburg, NJ).

Liver specimens and expressed enzymes. Human livers were obtained
from the following organizations under protocols approved by the appropriate
committee for the conduct of human research: SRI International (Menlo Park, CA),
International Institute for the Advancement of Medicine (IIAM, Exton, PA), Vitron Inc.
(Tucson, AZ), Anatomical Gift Foundation (AGF, Woodbine, GA) and National
Disease Research Institute (NDRI, Philadelphia). Microsomes were then prepared
using differential centrifugation (van der Hoeven and Coon, 1974). Baculovirus
expressed CYP2D6 was produced at Pfizer as described by Mankowski et al. (1996).

Example 1 

IC50 determinations in 96 well format - 3 and 10 point screening. Each of
the eight rows in a standard 96-well plate was essentially a separate inhibition curve
(for 10 point screening) or 2 inhibition curves (for 3 point screening). First, separate
plates were prepared for the substrate and for dilutions of inhibitor. The contents of
these two plates were then combined in a 1:1 ratio to make a master plate of
substrate and inhibitor solutions (S/I plate). The remaining assay ingredients, a
combination of microsomes, (10%) NADPH generating cofactor solution (stock
solution: 125 mM MgCl2, 0.54 mM NADP, 6.2 mM DL-isocitric acid, 0.5 U/ml isocitric
dehydrogenase) and buffer (100 mM sodium phosphate, pH 7.4), were prepared on
ice and transferred to a polyvinyl reaction plate (RXN plate). Preparation of these
plates required the use of a Soken 96-well pipettor (Apricot Designs Inc, Encino, CA) and
a Robbins 96-well pipettor (Robbins Scientific Corporation, Sunnyvale, CA). The RXN
plate was preincubated to 37°C using an MJ Research Model PTC-100 automated
thermal controller and the reaction was initiated by addition of an aliquot from the S/I
plate. The reaction was allowed to proceed at 37°C before being terminated using
methanol (10 µl). HPLC or mass spec analysis was preceded by filtration of (150 µl)
using a Millipore multiscreen-MAHA mixed cellulose esters, triton-free, non-sterile
plate.

Example 2 

Phenacetin O-deethylation IC50 assay (CYP1A2). Human liver microsomes
(0.5 mg/ml protein), phenacetin (50 µM) and proprietary inhibitors were incubated,
terminated and filtered as described above (Example 1). Furafylline was used as a
positive control.

Diclofenac 4'-hydroxylase IC50 assay (CYP2C9). Human liver microsomes
(0.1 mg/ml), diclofenac (10 µM) and proprietary inhibitors were incubated, terminated
and filtered as described above (Example 1). Sulfaphenazole was used as a positive
control in place of proprietary compounds.

(S)-(+)-Mephenytoin hydroxylase IC50 assay (CYP2C19). Human liver
microsomes (0.1 µM P450), (S)-(+)-mephenytoin (50 µM) and proprietary inhibitors were
incubated, terminated and filtered as described above (Example 1). Ticlopidine was
used as a positive control.

Bufuralol 1'-hydroxylase IC50 assay (CYP2D6). Human liver microsomes
(0.1 µM P450), bufuralol (10 µM) and proprietary inhibitors were incubated,
terminated and filtered as described above (Example 1). Quinidine was included as a
positive control. Alternatively, recombinant CYP2D6 (0.1 mg/ml), bufuralol (3.4 µM),
proprietary inhibitor (0.1-10 µM) and sodium phosphate (100 mM, pH 7.4) in a total
volume of 0.5 ml were preincubated at 37°C before addition of NADPH (1 mg/ml) and
incubated further. The reactions were then terminated and filtered as described
previously before analysis (Example 1).

Testosterone 6β-hydroxylase IC50 assay (CYP3A4). Human liver
microsomes (0.1 µM P450), testosterone (50 µM) and proprietary inhibitors were
incubated, terminated and filtered as described above (Example 1). Ketoconazole
was included as a positive control.

Example 3 

Regression models for human liver microsomal CYP2C9, CYP2D6 and
CYP3A4. Initially, IC50 values for 163 proprietary compounds (run multiple times)
were generated using the 10 point curve procedure and then compared with values
produced using the 3 point curve (r = 0.99, Figure 1). This naturally led us to
investigate whether we could reliably predict IC50 using fewer than 3 points, i.e. a
single point screen. At an inhibitor concentration of 3 µM a strong correlation was
observed between log10(IC50) and 100-percent inhibition for the compounds
analyzed (r² = 0.90, Figure 2). The CYP2C9, CYP2D6 and CYP3A4 models all follow the
same trend at this concentration, in that a linear relationship was observed. Similar
linear relationships were also demonstrated for log10(IC50) and 100-percent inhibition
at 1 µM and at 10 µM. A correlation between log10(IC50) and 100-percent inhibition
at 3 µM is to be expected, because if the Hill function describes
the dose-response relationship well, then log10(IC50) can be expressed as:

$$\log_{10}(IC_{50}) = \log_{10}(3) + \frac{1}{h}\left[\log_{10}(\text{percent inhibition at } 3\,\mu\text{M}) - \log_{10}(100 - \text{percent inhibition at } 3\,\mu\text{M})\right].$$

Since (100-percent inhibition at 3 µM) and log10(percent inhibition at 3 µM) -
log10(100-percent inhibition at 3 µM) are almost linearly correlated between 20% and
80%, using (100-percent inhibition at 3 µM) as a predictor in a linear model to
predict log10(IC50) would make a useful model. Regression analysis of IC50 data
determined from 10 or 3 inhibitor concentrations was then performed to obtain a
prediction model together with the associated uncertainties of the predictions in each
case. All of the initial data at 3 µM for log10(IC50) against 100-percent inhibition are
shown in Figure 3. This figure shows that the positive control compounds all fall in the
low IC50 and low 100-percent inhibition at 3 µM region. In addition, they
mostly do not overlap with the proprietary compound data, and the relationship
between log10(IC50) and 100-percent inhibition at 3 µM seems to follow a different
slope for positive control compounds. This is particularly evident with IC50 values less
than 0.5 µM. Therefore separate models were produced for proprietary compounds
and the positive control compounds.
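
The near-linearity claimed for the 20% to 80% range can be checked with a short
Python sketch. It is illustrative only: the helper names are hypothetical, and an
evenly spaced grid of percent inhibition values is assumed rather than real screening
data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def log_ratio(percent_inhibition):
    """log10(percent inhibition) - log10(100 - percent inhibition)."""
    return math.log10(percent_inhibition) - math.log10(100.0 - percent_inhibition)


def linearity(low, high):
    """Correlation of (100 - PI) with the log ratio for PI = low..high percent."""
    grid = list(range(low, high + 1))
    remaining = [100.0 - p for p in grid]
    ratios = [log_ratio(p) for p in grid]
    return pearson_r(remaining, ratios)


r_mid = linearity(20, 80)    # the range named in the text
r_wide = linearity(1, 99)    # nearly the full range, for comparison
```

The correlation is negative because higher remaining activity corresponds to a lower
log ratio; its magnitude is closer to 1 on the 20 to 80 grid than on the 1 to 99 grid,
which is why the linear model is most trustworthy for mid-range inhibition values.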

Regression analysis of log10(IC50) vs. 100-percent inhibition at 10 µM showed
that data from CYP2D6 screens follow a statistically different line than data from the
other 2 screens, so CYP2D6 data were fitted to a different model. This is in contrast to
the regression analysis of log10(IC50) against 100-percent inhibition at 3 µM, which
showed that data from all 3 screens followed the same line. Regression analysis of
log10(IC50) vs. 100-percent inhibition at 1 µM showed that we should use data from all
3 screens to build one model. Figure 4 presents all of these regression models
along with the data and the 95% prediction intervals for comparison.

There are three potential models capable of generating one point IC50
predictions, namely the models using percent inhibition at 1 µM, 3 µM or 10 µM. To
test whether there were significant differences in their abilities to predict IC50, we used
data from all three screens to fit a single regression for the model using percent
inhibition at 10 µM. This model turned out to be very similar to the model using data
from the CYP2C9 and CYP3A4 screens. We then used a randomization t-test
proposed by H. van der Voet (van der Voet, 1994) to compare the predictive nature of
the three models. We first compared the models with percent inhibition at 10 µM and with
percent inhibition at 3 µM. To perform the test we randomly selected 103 data points
from the 163 available data points as a training set, while using the remaining 60 data
points as a test set. Models were then built using the training set and used to
predict the log10(IC50) of the test set. The prediction errors were then calculated
for both models. The null hypothesis (H0) was that the squared prediction errors from
the two models have the same probability distribution. The alternative hypothesis (H1)
was that the mean squared prediction error from the model with percent
inhibition at 3 µM was larger than the mean squared prediction error from the model
with percent inhibition at 10 µM. The differences between the squared prediction
errors of the two models were then calculated using the following equation:

$$d_i = e_{3,i}^{2} - e_{10,i}^{2},$$

where i is the index for data in the test set and $e_{3,i}$ and $e_{10,i}$ represent
prediction errors from the model with percent inhibition at 3 µM and the model with
percent inhibition at 10 µM, respectively. The observed statistic was calculated by
$T_{obs} = \text{mean}(d_i)$ over the test set. A Monte Carlo procedure was then used to
simulate the reference distribution of the statistic T under the null hypothesis. We randomly
assigned signs to $d_i$ and then calculated the randomized T by T = mean(signed $d_i$)
over the test set.

The above steps were repeated 999 times, each time generating a
randomized T. These values provide a simulated distribution of T under the null
hypothesis. If the null hypothesis is true, the observed T_obs should lie in the "fat" part
of the reference distribution. If the alternative hypothesis is true, T_obs
would lie somewhere in the upper tail of the distribution. We ranked T_obs
among the randomized T values and found that it ranked 55th from the top. Therefore the p-value for
this test is 55/1000 = 0.055, indicating that the difference between the two models is at
most marginally significant. On closer inspection of the models and the data (Figure 4C),
we suspect that the larger mean squared prediction error for the model with
percent inhibition at 3 µM is largely due to the four points marked in the figure.
We investigated these points and concluded, from the individual IC50 plots, that the
values for these four points were questionable. These four
points were therefore removed from the data set and the above procedure was repeated
to determine the model difference statistic. The new p-value was 0.176, indicating
no significant difference between the two models in their ability to predict
IC50. Figure 5 shows the reference distribution of the randomized T, based on 1000
numbers generated from the Monte Carlo simulations under the null hypothesis, together with
the observed T value. On the whole, the model with percent inhibition at 3 µM and
the model with percent inhibition at 10 µM predicted IC50 values equally well. Figure 6
visually compares the predicted log10(IC50) from the regression model with percent
inhibition at 3 µM (1 point predicted log10(IC50)) with the log10(IC50) values for
proprietary compounds in the test set, which were determined from either a 10 point curve
or a 3 point curve (10 or 3 point log10(IC50)). This shows that the model with percent
inhibition at 3 µM predicts the IC50 value well from a single concentration. We then
used the same procedure to compare the models with percent inhibition at 10 µM and
with percent inhibition at 1 µM. The test concluded that there is a statistically
significant difference between the two models in their ability to predict IC50
(p-value of 0.001).
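
The sign-randomization test described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the reported results: the function name `sign_flip_test`, the seed, and the six pairs of squared errors are hypothetical. It assumes the p-value is the rank of the observed statistic from the top among the 999 randomized values plus the observed value, which is how we read the 55/1000 calculation.

```python
import random

def sign_flip_test(d, n_rand=999, seed=0):
    """One-sided sign-randomization test on paired differences d."""
    rng = random.Random(seed)
    n = len(d)
    t_obs = sum(d) / n
    # Randomized statistics: each difference keeps or flips its sign with probability 1/2.
    t_rand = [
        sum(x if rng.random() < 0.5 else -x for x in d) / n
        for _ in range(n_rand)
    ]
    # Rank of T_obs from the top among the n_rand + 1 values (observed value included).
    rank_from_top = 1 + sum(t >= t_obs for t in t_rand)
    return t_obs, rank_from_top / (n_rand + 1)

# Hypothetical squared prediction errors for a six-compound test set.
e3_sq = [0.04, 0.09, 0.01, 0.16, 0.02, 0.05]   # model with percent inhibition at 3 uM
e10_sq = [0.03, 0.02, 0.02, 0.04, 0.01, 0.03]  # model with percent inhibition at 10 uM
d = [a - b for a, b in zip(e3_sq, e10_sq)]
t_obs, p_value = sign_flip_test(d)
print(f"T_obs = {t_obs:.4f}, p = {p_value:.3f}")
```

A small p-value places T_obs in the upper tail of the reference distribution, which favors the alternative hypothesis that the 3 µM model has the larger mean squared prediction error.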

The least squares criterion was used to fit the regression models. The model with
percent inhibition at 10 µM for CYP2D6 is: predicted log10(IC50) = -0.2238 + 0.0245 ×
(100 - percent inhibition at 10 µM). This resulted in an r² value of 0.97, p < 0.00001
and residual standard error s = 0.09. The 95% prediction interval is approximately: predicted
log10(IC50) ± 2 × 0.09. The model with percent inhibition at 10 µM for CYP2C9 and
CYP3A4 is: predicted log10(IC50) = -0.0778 + 0.0206 × (100 - percent inhibition at 10
µM). This resulted in an r² value of 0.91, p < 0.00001 and s = 0.14. The 95% prediction
interval is approximately: predicted log10(IC50) ± 2 × 0.14. The model with percent
inhibition at 3 µM for all three screens is: predicted log10(IC50) = -0.5249 + 0.0212 ×
(100 - percent inhibition at 3 µM). This resulted in an r² value of 0.90, p < 0.00001 and
s = 0.14. The 95% prediction interval is approximately: predicted log10(IC50) ± 2 ×
0.14. This model, together with the data and the 95% prediction interval, is shown in
Figure 7.
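
As an illustration of how the 3 µM model might be applied to a single measurement, the sketch below converts a percent inhibition value into an IC50 estimate with the approximate 95% prediction interval of ± 2 × s on the log10 scale. The function name `predict_ic50_3um` and the 50% inhibition input are hypothetical; the coefficients and s are those given above for the model covering all three screens.

```python
def predict_ic50_3um(percent_inhibition, intercept=-0.5249, slope=0.0212, s=0.14):
    """Estimate IC50 (uM) and an approximate 95% prediction interval
    from percent inhibition measured at 3 uM."""
    x = 100.0 - percent_inhibition
    log_ic50 = intercept + slope * x
    # Approximate 95% prediction interval: +/- 2 * s on the log10 scale.
    lower, upper = log_ic50 - 2 * s, log_ic50 + 2 * s
    return 10 ** log_ic50, 10 ** lower, 10 ** upper

# Hypothetical compound showing 50% inhibition at 3 uM.
ic50, lo, hi = predict_ic50_3um(50.0)
print(f"IC50 ~ {ic50:.2f} uM (approx. 95% interval {lo:.2f} to {hi:.2f} uM)")
```

Because the interval is symmetric on the log10 scale, it is multiplicative on the original scale: the bounds are the point estimate divided and multiplied by 10^(2 × s).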

As observed in Figure 3, the slope for log10(IC50) vs. 100 - percent inhibition at
3 µM for positive control compounds appears to be different from that for the sample
proprietary compounds with IC50 values less than 0.1 µM. Therefore, a separate
regression model was generated for positive control compounds. In this case forty
data points were available for positive control compounds (Figure 8), and the slope
of this line is steeper than that for the sample proprietary compound data. As with the sample
proprietary compounds, the least squares criterion was used to fit the model and no
difference was found among the three screens. Because
all the positive control compounds are very potent, the best model in this case is the
one with percent inhibition at 1 µM. The equation is: predicted log10(IC50) = -1.4585 +
0.0260 × (100 - percent inhibition at 1 µM). This resulted in r² = 0.88, p < 0.00001
and s = 0.23.
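
The least squares fits reported in this Example, each summarized by an intercept, a slope, r² and a residual standard error s, can be outlined as follows. This is a sketch using the textbook formulas for simple linear regression, not the software actually used; the function name `fit_line` and the six data points are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x.
    Returns intercept a, slope b, r squared and residual standard error s."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - rss / syy
    s = math.sqrt(rss / (n - 2))  # n - 2 degrees of freedom: two fitted parameters
    return a, b, r2, s

# Hypothetical data: x = 100 - percent inhibition, y = measured log10(IC50).
x = [20, 35, 50, 65, 80, 95]
y = [-0.15, 0.25, 0.50, 0.90, 1.15, 1.55]
a, b, r2, s = fit_line(x, y)
print(f"log10(IC50) = {a:.4f} + {b:.4f} * x, r2 = {r2:.3f}, s = {s:.3f}")
```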

Example 4 

Using the same mathematical and statistical techniques described in Example
3, and letting x represent (100 - percent inhibition at 3 µM), 54 data points from
CYP1A2 resulted in the following equation:

$$\log_{10}(IC_{50}) = -0.8146 + 0.0277 \cdot x, \quad \text{for } x > 19$$

with R² = 0.93, residual standard error s = 0.18 and p-value for the regression
p < 0.0001. Figure 9 shows the data and the model. The dotted lines in the figure
represent the 95% prediction intervals.
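
As a worked illustration of the CYP1A2 equation, consider a hypothetical compound (not taken from the data set) that shows 60% inhibition at 3 µM, so that x = 100 - 60 = 40, which lies within the stated range x > 19:

```latex
\begin{aligned}
\log_{10}(IC_{50}) &= -0.8146 + 0.0277 \times 40 = 0.2934 \\
IC_{50} &= 10^{0.2934} \approx 1.97\ \mu\mathrm{M}
\end{aligned}
```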

Example 5 

Using the same mathematical and statistical techniques described in Example
3, 37 data points from CYP2C19 were analyzed and found to have an identical
equation to that of the CYP2C9 data. Using the same notation as in Example 4, the
combined data sets of CYP2C19 and CYP2C9 yield the equation:

$$\log_{10}(IC_{50}) = -0.5124 + 0.0219 \cdot x, \quad \text{for } x > 9$$

with R² = 0.92, s = 0.23, n = 117, and p < 0.0001. Figure 10 shows the data and the
model. The dotted lines in the figure represent the 95% prediction intervals.

Example 6 

Using the same mathematical and statistical techniques described in Example
3, and using the same notation as in Examples 4 and 5, 175 data points for
recombinant CYP2D6 resulted in the following equations:

$$\log_{10}(IC_{50}) = -1.1605 + 0.0572 \cdot x, \quad \text{for } x < 18$$
$$\log_{10}(IC_{50}) = -0.4603 + 0.0183 \cdot x, \quad \text{for } x > 18$$

with R² = 0.85, s = 0.26, n = 175, and p < 0.0001. Two equations were used to describe
the slopes for the potent and less potent inhibitors. Figure 11 shows the data and the
model. The dotted lines in the figure represent the 95% prediction intervals.

Example 7 

Combined single point regression model for determining CYP inhibition. By
combining the data for CYP1A2, CYP2C19 and rCYP2D6 with the existing data from the
CYP2C9, CYP2D6 and CYP3A4 screens, we can update the regression model for
single point IC50 estimation for all the drug-drug interaction CYP screens, including
rCYP2D6. This yields a total of 569 valid data points. The data suggest a different
slope for very potent compounds (Figure 12), many of which are positive control
compounds. The cutoff point for the two slopes is at (100 - percent inhibition at 3 µM)
= 17, which was determined statistically as the point that yielded the best fit with
the smallest residual. This point also corresponds to IC50 values of approximately 0.5 to 0.8 µM.
Using x to represent the quantity (100 - percent inhibition at 3 µM),
we can write the new regression models for single point IC50 estimation for potent and
less potent compounds as:

$$\log_{10}(IC_{50}) = \begin{cases} -1.2919 + 0.0642 \cdot x, & \text{for } x < 17 \\ -0.5779 + 0.0222 \cdot x, & \text{for } x > 17 \end{cases}$$

The equation for x < 17 is based on 122 data points. For this model, R² = 0.46, the
residual standard error s = 0.371, and the p-value < 0.0001. The equation for x > 17 is
based on 447 data points. For this model, R² = 0.90, the residual standard error
s = 0.187, and the p-value < 0.0001.
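
The two-slope combined model can be applied as a simple piecewise function, sketched below. The function name `predict_log_ic50_combined` and the example inputs are hypothetical. The equations are stated for x < 17 and x > 17 only, so assigning x = 17 to the second branch is an assumption; it has no practical effect, because both lines give log10(IC50) = -0.2005 (IC50 of about 0.63 µM) at x = 17.

```python
def predict_log_ic50_combined(percent_inhibition_3um, cutoff=17.0):
    """Two-slope combined model; x = 100 - percent inhibition at 3 uM."""
    x = 100.0 - percent_inhibition_3um
    if x < cutoff:
        return -1.2919 + 0.0642 * x  # potent compounds (x < 17)
    # Assumption: x == 17 is handled here; the source states only x > 17.
    return -0.5779 + 0.0222 * x      # less potent compounds

# Hypothetical percent inhibition values measured at 3 uM.
for pct in (95.0, 83.0, 50.0, 10.0):
    log_ic50 = predict_log_ic50_combined(pct)
    print(f"{pct:5.1f}% inhibition -> IC50 ~ {10 ** log_ic50:.3f} uM")
```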

Example 8 

Assessment of analytical variability for measuring IC50 values. CYP isoform
selective inhibitors were used as positive controls to monitor the variability of the
method over time. The inhibitor positive controls furafylline, sulfaphenazole,
ticlopidine, quinidine and ketoconazole were analyzed for CYP1A2, CYP2C9,
CYP2C19, CYP2D6 and CYP3A4, respectively. Data were collected on
different days (n > 8 inhibition curves). Mean IC50 values for furafylline,
sulfaphenazole, ticlopidine, quinidine and ketoconazole were 1.195, 0.876, 0.996,
0.093 and 0.051 µM, respectively. The inter-assay precision for furafylline,
sulfaphenazole, ticlopidine, quinidine and ketoconazole was 33%, 26.1%, 40.1%,
25.7% and 41.2%, respectively (Table 1). By comparison, the regression model with
percent inhibition at 3 µM for most proprietary compounds has a prediction standard
error of s = 0.187 on the log10 scale, which translates into a relative standard deviation
(RSD) of roughly ln(10) × s on the original scale for IC50. Thus the RSD for
predicting IC50 = 2.302 × 0.187 = 43%.
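
The conversion from a standard error on the log10 scale to an approximate RSD on the original scale follows from IC50 = 10^y = exp(y × ln 10): a small change in y produces a relative change in IC50 of ln(10) times that change. A minimal check of the arithmetic is given below; the function name `rsd_from_log10_se` is hypothetical.

```python
import math

def rsd_from_log10_se(s):
    """Approximate relative standard deviation on the original scale
    from a standard error s on the log10 scale (first-order approximation)."""
    return math.log(10) * s  # math.log is the natural logarithm

# s = 0.187 is the residual standard error of the combined model for x > 17.
print(f"RSD ~ {100 * rsd_from_log10_se(0.187):.0f}%")  # prints "RSD ~ 43%"
```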


Table 1. Summary of inhibitor positive control data.

                CYP1A2          CYP2C9          CYP2C19         CYP2D6          CYP3A4
                Furafylline     Sulfaphenazole  Ticlopidine     Quinidine       Ketoconazole
                IC50 [µM]       IC50 [µM]       IC50 [µM]       IC50 [µM]       IC50 [µM]
                n = 15          n = 30          n = 19          n = 40          n = 45
MEAN            1.195           0.876           0.996           0.093           0.051
S.D.            0.394           0.229           0.40            0.024           0.021
Precision (%)   33              26.1            40.1            25.7            41.2
Figure 9
Data and model for CYP2C9 and CYP2C19
Figure 11
Figure 12



